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Abstract 

Constructing models to discover physics underlying magnanimous data is a traditional strategy in data 
mining which has been proved to be powerful and successful. In this work, a multi-optimized recurrent neural 
network (MRNN) is utilized to predict the dynamics of photosynthetic excitation energy transfer (EET) in 
a light-harvesting complex. The original data set produced by the master equation were trained to forecast 
the EET evolution. An agreement between our prediction and the theoretical deduction with an accuracy 
of over 99.26% is found, showing the validity of the proposed MRNN. A time-segment polynomial fitting 
multiplied by a unit step function results in a loss rate of the order of 1075, showing a striking consistence 
with analytical formulations for the photosynthetic EET. The work sets up a precedent for accurate EET 
prediction from large data set by establishing analytical descriptions for physics hidden behind, through 


minimizing the processing cost during the evolution of week-coupling EET. 
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I. INTRODUCTION 


Many physical laws were discovered based on 
data, such as Kepler’s three laws via large da- 
ta of celestial body motion observed by Tycho. 
Understanding the temporal evolution of photo- 
synthetic excitation energy transfer (EET) in a 
light-harvesting complex is an important topic of 
broad interest due to its nearly 100% photosyn- 
thetic conversion efficiency, providing an ideal op- 
tion for mitigating energy crisis|1—3]. Exact nu- 
merical simulations of the dynamics of EET in 
a light-harvesting complex, on the other hand, 
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requires enormous computational resources|4, 5], 
which tends to grow exponentially with the num- 
ber of simulated time steps and system size. 
Even though many techniques, such as the hierar- 
chy of equations of motion (HEOM) technique|6, 
7], multi-configurational time-dependent Hartree 
(MCTDH)[8], stochastic Liouville-von Neuman- 
n equation|9], quasi-adiabatic propagator path- 
integral (QUAPI)|10], and path-integral Monte 
Carlo[11] are available, they are inappropriate for 
examining long-time quantum dynamical evolu- 
tion. 


The current state of a quantum system is most- 
ly determined by and rooted in its early stages 
of evolution, which enables us to learn long-time 


evolution of EET in a light-harvesting complex 
from short-time dynamics without the costly and 
direct long-time simulations. Once a memory k- 
ernel is acquired, the Nakajima-Zwanzig general- 
ized quantum master equation (GQME)[12] pro- 
vides a broad and formally reliable prescription 
for achieving this goal[{13]._ Nonetheless, solving 
the GQME and directly computing the memory 
kernel for an arbitrary system is still challeng- 
ing. The transfer tensor method (TTM) can solve 
the GQME somehow, but it requires an external 
numerical methodology to provide a set of dy- 
namical mappings|14—16]. The interplay between 
machine learning and quantum physics, on the 
other hand, altered the current situation[17—19] 
by providing new concepts for modeling the evo- 
lution of EET in a light-harvesting complex[20], 
i.e. intuitively and directly learning from data set 
without explicit theoretical construction. Artifi- 
cial neural networks (ANNs) [21] have shown that 
complex functional dependencies in time series 
can be learned directly from data[22, 23]. This 
evades the great efforts to make theoretical analy- 
ses, which sometimes become unjustified (e.g., the 
weak coupling limit) or difficult to derive if just 
based on phenomenological observations. ANNs 
have been demonstrated to be capable of solving 
the master equations governing the dynamics of 
long-time dissipative open quantum systems|24]. 


The recurrent neural network (RNN) in par- 
ticular has an exceptional capacity to interpret 
an intricate temporal behavior. It can preserve 
historical information for future purpose of pre- 
diction by building a feedback loop that receives 
both the current stage’s input and the previous 
step’s output|25]. However, concern of gradient 
vanishing or exploding behavior due to multiple 
iterating operations, on the other hand, limits the 
use of RNNs in long-time scale applications|26]. 
To address this flaw, this work employs multi- 
optimized recurrent neural networks (MRNNS) 
rather than long short-term memory recurren- 


t neural networks (LSTM-NN)[27, 28], to mod- 
el the long-term dependencies of the time-series 
data set, by storing the key information, and pre- 
dicting the future data that is not currently avail- 
able. MRNNs are also used as propagators of the 
time-dependent master equations to regulate the 
light-harvesting complex over a range of time s- 
cales. 

The rest of the paper is organized as follows. 
The quantum processes and the common mas- 
ter equations for photosystem II reaction cen- 
ter (PSII-RC) are described in Sec.2.A, and a 
sample RNN architecture with optimized hyper- 
parameters is introduced in Sec.2.B. Results are 
discussed in Sec.3, where we validate the learn- 
ing model in the interval [0. 80]fs (Sec.3.A) and 
predict the EET evolution process in the range 
[80, 500]fs (Sec.3.B), using the polynomial fitting 
and the analytical expression(Sec.3.C). Finally, 
conclusions and outlook of the this work are sum- 
marized in Sec.4. 


Il. THEORETICAL MODEL AND MULTI- 
OPTIMIZED RECURRENT NEURAL NET- 
WORKS 


A. Theoretical model for Photosystem II 
reaction center (PSII-RC) 


A typical photosystem II reaction center 
(PSH-RC) often seen in purple bacteria and 
oxygen-evolving organisms (cyanobacteria, al- 
gae, and higher plants) comprises six pigment 
molecules located at the core of the complex with 
two symmetric branches of protein matrix[29]. 
Four chlorophylls of them (special pair PD, 
PD, and accessory ChID,, ChlD2) and two pheo- 
phytins (PheD,, PheDg2), are parallel distributed 
in these two branches of protein matrix|30]. The 
pair of chlorophylls, PD;and PD», located at the 
center of the PSII RC act as the primary electron 
donors, forming two excited states denoted as 


FIG. 1. (Color online) Energy-level framework for the photosyn- 
thetic RC with two load-transitions |a;)—>|8;);-1,2. The electronic 
transitions from ground state |g) to two coupled dipoles |e1) and |e2) 
are induced by the high temperature photon bath, while the low tem- 
perature phonon bath drives charge transfer from |ez) to |a;)i=1,2, 


and |3;);=1,2 to |g) with a termination of electronic circulation. 


|e1,2) in Fig.1. Two pheophytin pigments, PheD, 
and PheD, couple to the rest of the molecules 
and act as the electron acceptors|a,) and |ag), 
respectively|31], as shown in Fig.l. This is an 
energy-level framework abstracted from the PSII- 
RC[2], where |3;)(|2)) is a positively charged 
state after an electron is released and |g) is a 
ground state. 

After absorbing a solar photon, an electron is 
excited from |g) to |e2) with a transition rate yp, 
where the excited electron may transit to |e,) s- 
tate at a rate ye. Then the excited electron is 
transferred to the acceptors by emitting a phonon 
via two pathways: |e1)(|e2))— Jaz) or |e1)(|e2))—> 
la) at emission rates 71. and Y2, respectively, 
with |a) and |az) being the ion-pair states in 
these two pathways. Such a process is accompa- 
nied by a spatial separation of positive and nega- 
tive charges induced by the release of the excited 
electrons to the plastoquinone molecule, leaving a 
hole in the dimer. Then the electrons are released 
from the two acceptors PheD, and PheD» denot- 


ed by |a,) > |61) (Path 1) and Jaz) > |82) (Path 
2) with respective rate |I;) j=1,2), providing ener- 
gies for possible work. Finally, the electron re- 
turns to the primary electron donor via |/3;)(|G2)) 
— |g). As an alternative pathway to the two- 
step processes |a@1)(|@2)) —|G1)(|G2)) — |g), the 
acceptor-to-donor charge recombination can also 
be made by directly bringing the system back to 
the ground state |g) without producing any cur- 
rent with rates yTj=1,2, where y is a dimensionless 
fraction[2] describing the radiative recombination 
rate of the two pathways. 

Before we make our prediction, it is significan- 
t to gather a data set for the learning model to 
start with. In this work, the data used for ma- 
chine learning are collected from the initial stage 
of the PSII-RC mentioned above, which is gen- 
erated by a weak contact of the system with its 
environment. 

The unitary evolution of the electron transfer, 
as usual, can be described by an electronic Hamil- 
tonian 
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via a Lindblad-type master equation 


4 — 4, A Lupt Lich + Loci + Larch 
+Lpp + Lyrp, (2) 


where the strength of the interaction with the en- 
vironment is comparable with the internal inter- 
actions inside the system. And the last term in 
Eq. (1) shows the the dipole-dipole coupling be- 
tween |e1,2} is set as one unit. Then the Lindblad- 
type superoperators in Eq. (2) are listed as 
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Here Eq. (3) describes 
temperature reservoir with 


the effect of high- 
n, denoting its av- 
erage photon number. The low tempera- 
ture reservoir has the average phonon number 
lexp(“2 Fa) — 1)"' in Eq. (4), where 
Viie=Vie(V5je=jc) are the spontaneous decay rates 


Ne = 
from level |e2) to level |a;)(i = 1,2), respective- 
ly, and the cross-coupling Yije with Yije™=Vjice de- 
scribes the effect of Fano interference. Similar- 
ly, another low temperature reservoir is described 
by Eq. (5) with n= fexp(“2*) — 1]~! being 
the cold reservoir phonon number. Here Dj;-=T jic 
is defined as Tije = mVTicl'je with m denoting 
the quantum interference robustness to describe 
the Fano interference induced by the sponta- 
neous decay rates, De=Tie(V5je=T jc) (¢, j = 1, 2), 
In Eq. (6), 


)|-1 is the corresponding ther- 


from level |8; 1,2) to level |g). 

=lexp( F 
mal occupation number of photons at tempera- 
ture Ta. ap in Eqs. (7) and (8) is defined as 
Ĝan =la: (Qi |(2=1,2). Next, 1 million training data 
sets within 100 fs are collected from the densi- 
ty matrix element equations (see the Appendix) 


with parameters|2] listed in Tab. (1). 


B. Multi-optimized recurrent neural net- 
works(MRNN) 


In this work, the dynamic evolution of popu- 
lations on each energy level in Fig. 1 was utilized 
to illustrate EET evolution, and the long-term 
one in the proposed PSII-RC will be predicted us- 
ing a recurrent neural network (RNN) tuned by 
The simple RNN[21, 32] 
is a type of neural network that is designed to 


four hyperparameters. 


learn data sequences such as time series, as il- 
lustrated in Fig. 2. To further understand how 
they function, it is reasonable to compare them 
to more traditional feedforward neural network- 
s, in which the input data set is propagated step 
by step through multiple intermediary layers with 
the training performed by updating the weight 
matrices and the vectors consecutively until the 
final output layer is reached. In this way the 
neural network learns the possible input-output 


TABLE I. Parameters used in our numerical calculations. 


Values Units 
Ee, — Eg = Ee, — Eg 0.185 eV 
Ba = Egy = Ee = Pag 0.2 eV 
Eg, — Eg = Ep, — Eg 0.2 eV 
Yh 2.48 x 107 eV 
Ye 0.025 eV 
Yie = Y2e 0.012 eV 
rı =T>2 0.124 eV 
Tye = Pac 0.0248 eV 
Ta 0.026 eV 
Nh 6000 
Ne 0.46 
x 0.2 
nı 0.25 


correlations hidden behind the data, which can 
theoretically be used to handle temporal data. 


FIG. 2. The architecture of the simple recurrent neural network 


(RNN) model. 


However, a feedforward neural network is 
hardly the best solution since the number of free 
parameters rapidly grows with the number of time 
steps. As an alternative option, RNNs handle 
this issue by adopting a cyclic connection design 
in which the update rule for the hidden layers at 
time t is decided not only by the current state St, 
but also by the previous state (t— 1), as shown in 
Fig. 2, where x;(i = t—1,t,t+1), S; and O+, rep- 
resent the input sample data, the memory of the 


input sample, and the state information stored 
at the time t;, respectively. Such a RNN update 
rule is given by two activation functions f and g 
written as 


Si = f(Ux; + WS;_1), 
O: = g(V Sz), 


(9) 
(10) 


where the input sample weight U, the input 
weight W, and the output sample weight V are all 
time-independent matrices as the data set propa- 
gates forward. Here the activation functions f 
can be tanh, relu, sigmoid or other functions, 
while g is usually softmax and other function- 
s. As shown in Eq. (10), the previous state S1 
will participate the prediction of the current state 
Sı. In practice, using the three weight matrices, 
the input data and the hidden state at the previ- 
ous time step can be taken into consideration in 
generating the hidden state at the current time 
step[33, 34]. 

Due to these features of repeating operations, 
the Markovian assumption of independent data 
points at multiple time steps will become prob- 
lematic in the application of RNN. If a machine 
learning model favors noise and minutiae, and ig- 
nores the general trends and patterns in train- 
ing the data set, it will perform well on train- 
ing data but poorly on generating new data, a 
phenomenon termed overfitting. To avoid this, 
we use a simple RNN with a multi-optimizer 
to predict the quantum evolutions of EET in 
Besides, the multi- 


optimized hyperparameters are addressed further 


light-harvesting complexes. 


below. 


1. Learning rate regulator (LRR) 


In each iteration, usually a learning rate regu- 
lator (LRR) is fitted to the simple RNN in order 
to update the model parameters and determine 
how far down the gradient the parameter moves 


with each update[35, 36]. The exponential atten- 
uation regulator, cosine LRR, preheating regula- 
tor, and LR attenuation regulator are all com- 
monly used LRRs. A successful LRR should be 
able to optimize the model while avoiding overfit- 
ting and underfitting. If the LRR is set too high 
or too low, the model may diverge or converge s- 
lowly, leading to more training rounds before the 
optimal solution is obtained. 

To simplify the task, we define an LRR at the 
starting stage of training. During training, the 
model computes the gradient of the loss function 
so as to determine the updating direction of the 
parameters, and such a procedure is repeated and 
updated again and again during the dynamic pro- 
cess. The performance of the model is evaluated 
using both the training and verification sets. Be- 
cause the time series for the quantum evolutions 
of EET are projected to be long, a faster decaying 
LRR, i.e., an exponentially decaying LRR, 


LRR = 0.001 x exp(— ), (11) 
is employed to this RNN. Here epochis the param- 
eter reflecting the iterations during the training. 


2. Early stop function 


In addition to the previously stated anti- 
overfitting strategies, an early stop function|37] 
is employed to decrease the occurrence of overfit- 
ting. As implied by its name, before the algorith- 
m overfits, the early stop technique completes the 
training and obtains the optimal global outcome, 
resulting in a robust generalization performance, 
as shown by 


Evlt) := Miny <= Eya(t’), (12) 
_ va(t) 
GL(t) = 100 x [pad 6 =p (13 


where Fopt(t) is the ideal verification error set as a 
function of the number of repetitions t, and GL(t) 
is the generalization loss evaluating the rate at 
which the generalization error grows in compari- 
son to the previous lowest error. When the gener- 
alization error is large, an early stop is preferable 
since it indicates that the model has been fitted. 
Such an ending of training is judged by a thresh- 
old of GL(t). The early stop technique’s halting 
the first, 
second, and third. In this work, the first type of 


criterion is classified into three types: 


stop rule is employed to determine the loss func- 
tion and accuracy on the verification set. 


3. Regularization 


Regularization via adding penalty terms in- 
to the loss function of the model is becoming 
a popular strategy for reducing the overfitting 
and improving the model generalization in ma- 
chine learning. There are two types of regulariza- 
tion, namely L1 and L2[38, 39], evaluated by two 
weight parameters w* and w following and before 
the update, respectively. They are given by 


Wry = argminy{MSE(y, 


+3 leh 


(14) 


D w} J 


(15) 


Wie = argMiny{MSE(y, 


where the sum of the absolute values and the 
square of the parameters have been added to the 
loss function, respectively|[39]. The first terms on 
the right side of the two equations represent the 
original loss functions, the second terms represen- 
t the regularization parameters A, and the sum- 
mation is carried out over the numerous model 
ownership parameters. L1 regularization in E- 


q. (14) creates sparse solutions and is more suit- 


FIG. 3. The structural diagram of Recurrent Neural Network (a) 


without and (b) with Dropout regularization techniques. 


TABLE II. Hyperparameters used in the RNNs train- 


ing 


L2 LR epochs Dropout 
Paya, 0.01 0.0001 139 0.11747474747474748 
Pasay 0.01 0.0001 140 0.09779519760654846 
Pee, 9.01 0.0000001 155 0.32499999999999996 
Peze) 0.01 0.0000001 125 0.32499999999999996 
Po» 0.01 0.0000001 122 0.35 


able for feature selection, whereas L2 regulariza- 
tion Eq. (15) tends to produce a number of tiny 
but non-zero weights. In this study, L2 regular- 
ization is used to prevent overfitting in the long- 
term training data set. 


4. Dropout 


In this work, the Dropout regularization 
technique[40—42] is also applied to the simple RN- 
N. As the training of the input data is carried 
out forward through the neural network, the es- 
timated loss propagates backward, and the pa- 
rameters are changed using the gradient descent 
approach, as shown in Fig. 3(a). In Fig. 3(b), the 
Dropout parameter P is introduced to deactivate 
certain neurons with a specific probability dur- 
ing the forward propagation, and the parameters 
are updated using the gradient descent approach. 
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This method is repeated again and again so as to 
properly alleviate overfitting during the training 
process in this proposed MRNN. 


II. RESULTS AND DISCUSSIONS 


A. Training the multi-optimized recur- 
rent neural network (MRNN) 


After progressively incorporating a few hyper- 
parameters, such as early stop function, L2 regu- 
larization method, LRR, Dropout, and Bayesian 
optimizer, a multi-optimized RNN model is con- 
structed. As a distinguishing feature, this mod- 
el utilizes the Lindblad-type master equation E- 
q. (2) to gather data from the proposed PSH-RC, 
with parameters listed in Tab. (I). By using E- 
q. (2), we produce a data set of 1 million data 
points in 100 fs, which are divided into training 
and test sets at 4:1 ratio. The first 800,000 data 
points serve as the training set, while the remain- 
ing 200 000 act as the test set, using the hyper- 
parameters listed in Tab. (II). Fig. 4(a) displays 
the evolution of the training set fed into this pro- 
posed multi-optimized RNN learning model (O- 
riginal codes in SM1) during the interval [0, 80]fs. 


In order to assess the validity of the proposed 
learning model, we predict the evolution of EET 
(solid curves) over a time range from 80 to 100 fs 
with a comparison to the test set (dotted curves) 
collected from the aforementioned PSH-RC with- 
in the same range, indicating a good agreement 
This 
ensures the high accuracy of this learning model 


between them, as illustrated in Fig. 4(b). 


composed of the simple RNN and some optimiz- 
ers(Original codes in SM1). Supported by these 
results, the proposed multi-optimized RNN learn- 
ing model is expected to be able to anticipate the 
evolution of EET from 80 to 500 fs. 
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FIG. 4. (Color online) (a) Evolutions of the training population 
at all excited states within the interval[0, 80]fs. (b) A comparison 
of learning quality between our proposed learning model and the 
collected testing set over the range [80,100]fs. The hyperparameters 
listed in Tab.(II) and the original codes in SM1, SM2 and SM3 are 


used for the calculation. 


B. EET Predicted by the MRNN 


In Fig. 5, the vertical dashed black lines at 
the 80th fs shows the temporal starting point of 
our prediction achieved by gradually incorporat- 
The 
cut-off at 80 fs is used to evaluate the prediction 


ing hyperparameters into the calculation. 


accuracy by checking how precisely the data gen- 
erated by the prediction coincide with the train- 
ing data set. After examining Figs. 5(a)-(e), the 
evolutionary history of each population indicates 
that our predictions vary against hyperparame- 
ters in the following periods. L2 regularization, 
on the other hand, has little effect on EET pre- 
diction, as shown by the nearly identical curves in 
Figs. 5(a)-(e). When all the hyperparameters are 
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FIG. 5. 


different combinations of hyperparameters are shown in (a) to (e) 


(Color online) Prediction of the evolutive EET with 


via the multi-optimized RNN in the interval [80, 500]fs, and the 
predictive and theoretical (within [0, 500] fs) EET evolutions are 
demonstrated in (f) with all identical parameters to those in Fig.4 


included in the calculation. 


added into the simple RNN, there is a notable 
consistence between the training and prediction 
values at 80 fs, corresponding to the red curves in 
Figs. 5(a)-(e)(Original codes in SM2). 


Consider the prediction of EET on the excited 
state |e,) in Fig. 5(a), where the hyperparame- 
ters and optimizers are added one by one with- 
in the RNN architecture. Three layers with 128, 
64, and 32 neurons, respectively, are the essential 
characteristic neural network architecture. In the 
absence of optimizers and hyperparameters, the 
simple RNN timing prediction model fails to gen- 
erate substantial physical results, as the predic- 
tion (black dotted solid line) is not consistent with 
the training curve at 80 fs in Fig. 5(a). The time 
forecast equipped by the subsequently added ear- 
ly stop function does not make any difference to 
this inconsistence, as seen by the sky blue dot- 
ted solid curve overlapping that for simple RN- 
N. The overlapping purple and green curves in 
Fig. 5(a) indicate that the evolution of EET is 
not sensitive to the L2 regularization on the ex- 
cited state |e,), and almost the same evolution- 
ary properties over [80, 500]fs were exhibited by 
the insets in Fig. 5(a). The evolution curve of 
|e,) tends to level off when the Dropout parame- 
ter P=0.32499999999999996 is applied for further 
optimization, as demonstrated by the red curve. 


In Fig. 5(c), the red curve achieves per- 
fect docking with the 
80 fs with the Dropout parameter set as 
P=0.11747474747474748 and the initial LR= 
0.001. As it takes time to manually adjust the 
hyperparameters of the neural network when 


training data at 


optimizing the prediction of the dynamical 
population on |az), the Bayesian optimizer is 
implemented into the neural network to find the 
ideal number of layers and neurons. Finally, a 
neural network composed of 6.23223034545932 
layers with 124.19253861421566 neurons per 
layer was employed, corresponding to the red 


curve shown in Fig. 5(d). The same neural 


network architecture parameters were employed 
to forecast |b), and the inset in Fig. 5(e) clearly 
shows the roles each hyperparameter plays in 
predicting the evolution of pe». 
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FIG. 6. 


malized population sum in the interval [80, 500]fs. 


(Color online) Prediction accuracy defined by the nor- 


To validate the predictive accuracy of the pro- 
posed MRNN in this work, the evolution behav- 
iors of EET are theoretically simulated by using 
Eq. (2) in the temporal range of [0, 500] fs, as 
shown by the dotted lines in Fig. 5(f), while the 
optimal prediction results are re-plotted as dot- 
ted solid curves in the time range 80 to 500 fs, as 
shown by the red curves in Figs. 5(a)-(e) for com- 
parison(Original codes in SM1). It is found that 
the theoretical calculation and the curves predict- 
ed by MRNN for photosynthetic EET perfectly 
coincide, indicating the validity of our proposed 
MRNN. 

In order to quantitatively evaluate the predic- 
tion precision of EET for the system, an accuracy 
rate is defined as the normalized population sum 
during [80,500 fs]. Although the prediction pre- 
cision decreases against time for this photosyn- 
thetic system, it reaches a low of 0.9926 at 500fs, 
as shown by an arrow in Fig. 6(Original codes in 
SM1). Nevertheless, this accuracy outperforms 
most RNN learning models, such as those report- 
ed by Refs. [43, 44]. At the same time, because 
our MRNN learning model is born from the sim- 
ple RNN, long-term prediction inevitably will re- 
veal the inherent flaw of short-term prediction, re- 


sulting in a progressive drop in accuracy, as seen 
by the decreasing feature of Fig. 6. 


C. Polynomial fitting and analytical ex- 
pression for the predictive EET 


Using a polynomial to fit the predicted findings 
is a useful mathematical method for gaining an 
analytical knowledge of EET. Such a fitting usu- 
ally relies on the least squares method and lower- 
ing the error sum of squares[45, 46]. In Python, 
the Polynomial Features function is a utility in 
the scikit-learn library to generate the polynomi- 
al features so as to form a new feature matrix. 
Finally, the feature matrix is returned for further 
| 


model training and fitting. When a polynomial of 
order 2 is provided for a one-dimensional feature 
A, Polynomial Features generate a new feature 
matrix containing A to the first and second pow- 
ers. If the original feature has many dimensions, 
the resulting feature matrix will have power com- 
binations for each of them. 


Figure 7 displays the differences between the 
predicted curves and the fitting curves using Poly- 
nomial Features in scikit-learn library. A maxi- 
mum degree of order 5 for all polynomial fittings 
is adopted in Fig. 7 (Original codes in SM3). As 
a result, the analytical polynomial fittings cor- 
responding to to Figs. 7(a)-(e), respectively, are 
found to be 


Pe e, (t) = —0.0022184223655202433t + 1.3936336816327624 x 107%? 


—4.3658376701560287 x 10783 + 6.694506984131638 x 1071144 


—4.,008815848721703 x 107 144° + 0.33721254828307511, 


Pezes (t) = 0.0055891350945578t — 3.50638450439065 x 10754? 


+9.953236151938327 x 107145 + 0.24964385363485108, 


Paia, (t) =4.317228690621229 x 107'°t — 3.0372753727222955 x 1071? 


+1.0249801765297704 x 10~7°¢° + 0.00000002169596502, 


Paza (t) = —1.8447722039325343 x 10~°t + 1.0562606143359824 x 10782? 


—2.4712186236144212 x 107175 + 0.00021705673079966, 


pov(t)= —0.003319998227995093t + 2.0476938571365984 x 107 °t? 


(16) 
+1.091124267760133 x 1077: — 1.6659867620310589 x 107!°¢4 

(17) 
+1.0210132754732106 x 10714: — 1.647224363731816 x 107174 

(18) 
—3.040707800147938 x 10711: + 4.356557804662399 x 107144 

(19) 
—6.313410066086038 x 1078Ł + 9.606627816067419 x 1071144 

(20) 


—5.735689700969715 x 10714: + 0.41192131944573884, 


Where peje1; Peres Pasar, aNd py agree very well 
with the predicted curves, as shown in Fig. 7. In 
contrast, the fitting result of Paia; is rather poor, 
as shown in Fig. 7(c), where overfitting is visi- 
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ble in its dynamic population progress, indicat- 
ing that an alternative fitting technique should 
be found for it. 


Because the characteristic features of an RN- 
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FIG. 7. 


in comparison with EET evolution predictions (blue dash-dotted 


(Color online) Polynomial fittings (red dotted curves) 


curves) within the interval of [80, 500]fs. 
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N lies in its internal (hidden) loop memory, it is 
understandable that a dynamic state contains al- 
l the information about the previous input as it 
evolves through the data sequence, as shown in 
Fig. 2. If the polynomial order is increased, the 
polynomial features will over adapt to the past 
information but perform poorly on the fresh da- 
ta. This enlarges both amplitude and rate of the 
change, leading to overfitting. An effective solu- 
tion is to derive information based on not just 
purely the previous step, but all the previous in- 
puts, by implementing a time-segment strategy. 
Here polynomial fittings in different time ranges 
are carried out and a unit step function 


t 


is incorporated to the final polynomial expression 


t < 250 
t >= 250 ` 


t < 250 


flt) t >= 250 


in each time-segment. The fitting time is divid- 
ed into two intervals [80, 250] fs and [250, 500] fs, 
and the polynomial fitting in each interval is mul- 
tiplied by a unit step function given by Eq. (21). 
Their final fitting is the sum of the two polyno- 
mial fittings, i.e. 


Paia: (t80-500) = Para; (tso-250) * fı (t) 


+para; (t250-500) * fo(t). (22) 


The curves in Fig. 8 shows the fitting result- 
s for the population on the state |a,) using the 
aforementioned procedure (Original codes in S- 
M3). The two insets in Fig. 8 show a compari- 
son between polynomial fitting and predictions in 
the intervals [80,250]fs and [250,500] fs. respec- 
tively. The overlapping between the dotted lines 
(red) and dash-dotted lines (blue) demonstrates 
the high precision of polynomial fitting at various 
intervals. Furthermore, the analytical functions 
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FIG. 8. (Color online) A comparison between the fittings by the 
time-segment polynomial multiplied by unit step function (red dot- 
ted curves) and the evolution predictions (blue dash-dotted curves) 
in [80, 500]fs. Time-segment polynomial fittings in comparison with 
the evolution predictions shown by the insets(al) in [80, 250]fs and 


(a2) in [250, 500]fs. 


derived by the time-segment polynomial fittings 
are, respectively, given by 


Paa, (tgo—250) = 1.7388382452293818 + 107 °t 
—1.8538908793974676 * 107114? 

+9.883817236997048 + 107448 

—2.6256355830257586 * 107 1°44 


+2.7737661365102853 x 107195, (23) 
Para, (t250-500)= 2.500773638209706 x 107 "°t 
—1.2744394675654058 * 1071? 
+3.227678944102217 « 10717: 
—4.0616638358969016 * 107 2°44 


+2.031479651834943 x 107345. (24) 


The comparison made between the whole pe- 
riod fitting and evolution prediction in [80, 500]f- 
s for the dynamics populations on state |œ) 
demonstrates that the time-segment polynomi- 
al fitting can well overcome the overfitting in 
Fig. 7(c). 

Figure (9) depicts the total fitting loss rate 
compared to the predicted results, a physical 
quantity assessing the precision of the fitting tech- 
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FIG. 9. 


versus the prediction within the range of [80, 500]fs. 


(Color online) Total loss rate of the polynomial fitting 


nique employed in this work. It is found that even 
though a jumpy loss rate can be seen both at the 
initial and the final stages of the time interval, 
the loss rate exhibits an oscillating behavior of 
an order of 10~°, ensuring a high accuracy for 
this MRNN when utilized in this work. 

So far, the above fitting findings have accu- 
rately described the EET evolution, revealing the 
physical rules behind data, which may be utilized 
to purposely control EET and construct artificial 
photosynthetic devices in the future. 


IV. CONCLUSION AND OUTLOOK 


In summary, fed by the original PSH-RC data 
set, a MRNN strategy is proposed to forecast the 
EET evolution, with an accuracy of over 99.26% 
within 500fs if compared to the theoretical deduc- 
tion. The polynomial fitting is also implemented 
for the EET evolutions so as to get analytical 
results. The predicted EET evolutions are al- 
so subjected to time-segment polynomial fitting 
multiplied by a unit step function, and the ana- 
lytical formulations showing a high precision with 
the loss rate of an order of 10~° demonstrate the 
closeness to physical law. The results reveal that 
the proposed MRNN is a valid and powerful data 
mining tool for forecasting the evolution of EET 


in a light-harvesting complex. A comparison to 
experimental data in the future is expected so as 
to assess the validity of this learning model. 

is obtained. 
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VI. APPENDIX 


The density matrix dynamic element equations 
are given by 


[2(nm1. T 1) Benes — NicPpazaz T NicPaiar] 


perei = —Yel(Ne + 1)Perei — Nepesea]| yal + 1)Pereı — NhPggl, 
pezes = Velie |) pate, — Nepeses| — Vicl(Nic + 1)Pezez = Nei Pe | 

=Von[ (Mie? L)Peze2 — Nicpazaz] + 2a RelPoaz], 
Paras = Yiel(M1e + 1) Peres — NicPoroi] — Vi2cNicRelparaz] — (1+ AT pares, 
Poxa2 = Viel (Mie ae dpe = MePeie| = Yr2en1cRe| Paar] = (1 pa AJL Spates, 
; 1 1 
Poi as = tA Paras — 5 (Ne + YV2c)NicParas + z2 
Phib = Viparar — Viel(M2e + 1) p14, — N2chgg] — Viac(Mae + 1) Refps p], 
P Bobo = V2Par02 — Poe[(M2e + 1) P60 a N2cPgg] —TVy2e(M2e + 1) Re|pg, 62], 
PB B2 = —IA2P6 b2 — Ti + Poe) (n2 + 1)p 8,6. = 


Pog = 1 Perei — Peres — Parai — Pores — PBB. — 


where A; = Ey, — Ea, and Ag = Eg, — Eg, are the 
splitting of the states |a,)(|a2)) and |G;)(|G)). 
We utilize the equations to simulate dynamics of 
EET in PSII-RC. 
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1 
gT izel(rac + 1) 6164 + (Me + 1) P66. g 2N2cPgg]; 
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